In this short workbook, we outline how R can be used to visualise and compare the various computations of the Palau total fertility rate (TFR) over time. The data source is the World Population Prospects, United Nations (2022).
Below is the code used to build the scatter plot comparison.
First we load the tidyverse collection of R packages (Wickham et al. 2019), which includes the ggplot2 package (Wickham 2016) and the dplyr package (Wickham et al. 2022). We also load the plotly package (Sievert 2020), which is not part of the tidyverse and is attached separately.
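The loading chunk itself is not shown in this rendering. The attach messages that follow are consistent with calls along these lines (an assumption inferred from that output, not the author's confirmed code):

```r
# Assumed loading chunk, inferred from the attach messages below
library(tidyverse)  # attaches ggplot2, dplyr, readr, tibble and others
library(plotly)     # attached separately; masks ggplot2::last_plot, stats::filter, graphics::layout
```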
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 1.0.1
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.5.0
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
##
## Attaching package: 'plotly'
##
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
##
## The following object is masked from 'package:stats':
##
## filter
##
##
## The following object is masked from 'package:graphics':
##
## layout
Next we load our .csv data into R as a tibble object. The tibble has four columns: data_source, estimate_method, estimated_tfr, and estimated_year.
data = read_csv("data/tfr_estimates.csv") %>%
  as_tibble()
## Rows: 549 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): data_source, estimate_method
## dbl (2): estimated_tfr, estimated_year
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
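The message above suggests specifying the column types explicitly. A minimal sketch of doing so follows, assuming the four columns reported in the column specification. The inline CSV text and its values are made up purely for illustration and stand in for data/tfr_estimates.csv.

```r
library(readr)

# Toy CSV text (hypothetical values), wrapped in I() so readr treats it as literal data
toy_csv <- I(paste(
  "data_source,estimate_method,estimated_tfr,estimated_year",
  "Source A,Method 1,3.1,1990",
  "Source B,Method 2,2.8,1995",
  sep = "\n"
))

# Declaring the column types up front silences the column specification message
toy <- read_csv(
  toy_csv,
  col_types = cols(
    data_source     = col_character(),
    estimate_method = col_character(),
    estimated_tfr   = col_double(),
    estimated_year  = col_double()
  )
)

toy
```

With the real file, the same `col_types` argument would be passed alongside the file path. Since `read_csv()` already returns a tibble, the result can be used directly.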
We then use ggplot2 to build a static data visualisation.
# create the scatter plot, one colour per method/source combination
p = ggplot(data, aes(x = estimated_year,
                     y = estimated_tfr,
                     color = interaction(estimate_method,
                                         data_source))) +
  geom_point() +
  geom_line() +
  labs(x = "Estimated Year",
       y = "Estimated TFR",
       title = "Scatterplot of TFR Estimates of Palau 1950-2020",
       color = "") +
  theme_minimal() +
  theme(legend.position = "bottom",
        legend.box = "vertical", legend.margin = margin()) +
  guides(color = guide_legend(nrow = 15, byrow = TRUE))
p
Finally, using the ggplotly() function from the plotly package, we can make the plot interactive.
# convert the ggplot to plotly
interactive = ggplotly(p) %>% layout(showlegend = FALSE)
interactive
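
The colour mapping in the plot above relies on `interaction()` to combine estimate_method and data_source into a single grouping factor, so that each method/source pair gets its own colour and line. A minimal base R sketch of what that call produces, using made-up labels rather than values from the data:

```r
# Hypothetical labels, for illustration only
estimate_method <- c("Direct", "Indirect", "Direct")
data_source     <- c("Census", "Census", "Survey")

# interaction() pastes the two labels together, separated by "."
series <- interaction(estimate_method, data_source)

as.character(series)
# "Direct.Census" "Indirect.Census" "Direct.Survey"

# By default every possible combination is a level, including unobserved ones
levels(series)
```

Passing `drop = TRUE` to `interaction()` keeps only the combinations that actually occur.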